(54) Title: METHOD FOR POLYNUCLEOTIDE SEQUENCING
(57) Abstract 

The present invention is a method of sequencing polynucleotides by sequential step sequencing. Sequential step sequencing begins 
with a single-stranded polynucleotide template that is annealed with a primer forming a primer-template complex. In one embodiment 
of the method, one labeled nucleotide at a time is added to the primer-template complex. Alternatively, this method can be used with a 
nucleotide or nucleotide analog that does not stop addition after a first nucleotide or nucleotide analog has been added to the primer. Other
embodiments of the invention involve identification of polynucleotides having a contiguous non-redundant string or a superimposed non- 
redundant string pattern. The detection of a non-redundant contiguous string can be used to identify a particular gene. Alternatively, if the 
non-redundant contiguous string is not unique to a particular gene, the string can be used to form a DNA library that can then be searched 
with a second string. Similarly, a superimposed non-redundant string pattern can be used to identify a gene or to search a DNA library. 
Such strings can be used in annealing reactions and in computer searches of a database having a catalog of sequenced polynucleotides. 
METHOD FOR POLYNUCLEOTIDE SEQUENCING
Related Co-Pending U.S. Patent Applications 
This patent application is being concurrently filed with the following related
U.S. patent applications: "Nuclease Protection Assays," R. Kumar, inventor,
Attorney Docket No. DSRC 12038; "Microfluidic Method for Nucleic Acid
Amplification," Z. Loewy and R. Kumar, inventors, Attorney Docket No. DSRC
12050; "Method for Amplifying a Polynucleotide," Z. Loewy, inventor, Attorney
Docket No. DSRC 12081; "Automated Nucleic Acid Preparation," D. Southgate and
Z. Loewy, inventors, Attorney Docket No. DSRC 12120; and "Padlock Probe
Detection," R. Kumar, inventor, Attorney Docket No. 317913/12162. This patent
application is related to the following copending U.S. patent applications: Ser.
No. 60/009517, filed November 3, 1995; Ser. No. 60/00602, filed November 3,
1995; and Ser. No. 60/010513, filed January 24, 1996. All of the foregoing patent
applications are hereby incorporated by reference herein in their entirety.

This invention was made with U.S. Government support under Contract No.
70NANB5H1037. The U.S. Government has certain rights in this invention.

In one aspect, the present invention provides a new method for determining
the base sequence of RNA or DNA, termed sequential step sequencing. In
another aspect, the present invention provides new methods of identifying a
polynucleotide or polynucleotides using a contiguous string of non-redundant
nucleotides or a superimposed non-redundant string pattern.

Prior art methods of sequencing include the Maxam-Gilbert method and the
Sanger dideoxy method. In the Maxam-Gilbert method, a substrate DNA is
labeled on one strand with 32P at the 5'-hydroxyl terminus. The labeled DNA is
then broken preferentially at one of the four nucleotides using one reaction
mixture for each base, the reaction conditions causing an average of one break per
DNA molecule. In the reaction mixture for each base, each broken chain yields a
radiolabeled fragment extending from the 32P 5'-hydroxyl terminus to one of the
positions in the DNA in which that base appears. Thus, every time a base
appears in a DNA molecule, it generates a fragment of a different size; the
fragments are then separated by gel electrophoresis. The autoradiogram of a gel
onto which all four chemical reactions have been loaded shows a pattern of bands
from which the sequence of the DNA can be read. See, for example, Stryer,
Biochemistry (3d ed. 1988) at pages 120-121.

In the Sanger dideoxy method, DNA is sequenced by generating fragments
through the controlled interruption of enzymatic replication. First, a primer is
constructed which is complementary to the DNA sequence. Then, DNA
polymerase is used to copy a sequence of a single-stranded DNA using the primer
and four labeled deoxyribonucleoside triphosphates and a 2',3'-dideoxy analog of
each of the triphosphates. The incorporation of an analog in the new DNA strand
being synthesized results in the termination of incorporation of labeled
deoxyribonucleoside triphosphates since the dideoxy analogs lack the 3'-hydroxyl
terminus needed to form the next phosphodiester bond. Thus, the synthesis
results in DNA fragments of various lengths in which the dideoxy analog is at the
3' end. The reaction mixture for each base can then be separately electrophoresed
on a gel or electrophoresed together if the deoxyribonucleoside triphosphate
corresponding to each base has a separate label. See, for example, Stryer,
Biochemistry (3d ed. 1988) at pages 121-123.

The sequential step sequencing methods of the invention, unlike the above- 
described sequencing methods, involve individual reactions for each type of base 
in every position in which it appears on the DNA molecule, or for every position in 
which it appears next to a different type of base. The large number of reactions 
involved would likely have been considered impractical due to being too labor- 
intensive under the procedures known in the prior art for performing the 
necessary chemical reactions. 

However, the sequential step sequencing methods of the invention can be
used, for example, in the context of a microfluidics-based device for automatedly
moving fluids in and out of a reaction chamber, which has been disclosed in U.S.
Patent Serial Number 60/010513, filed January 24, 1996, the contents of which
are incorporated herein by reference. This combination of the microfluidics-based
system and the methods of the invention makes sequential step sequencing an
attractive alternative to known conventional methods of nucleotide sequencing.
Furthermore, the present invention provides an advantage in eliminating the
need for electrophoresis, which is one of the most time-consuming steps of the
sequencing reactions of the prior art. Additionally, the present invention provides
for an increased rate of sequence read-out since nucleotide addition can occur, for
example, at 800 nucleotides per minute.

Further, the present invention provides the advantage, for example, of
providing a mechanism for more accurate determination of sequences, such as the
sequence adjacent to the poly-A tail of a polynucleotide.

SUMMARY OF THE INVENTION 
In one aspect, the present invention provides a method of sequential step
sequencing of a polynucleotide having x number of nucleotides comprising:

(A) providing a single-stranded polynucleotide template and a first
complementary primer having n nucleotides, wherein n is an integer greater than
three;

(B) causing the template and the primer to anneal, thereby forming a
template-primer complex;

(C) adding a template-dependent nucleotide polymerase and at least one
nucleoside triphosphate or analog thereof having a label attached thereto,
wherein the nucleoside triphosphate or analog thereof includes a base selected
from the group consisting of adenine, thymine, cytosine, guanine, and uracil; and

(D) determining whether a label is associated with the template-primer
complex or which label is associated with the template-primer complex.

DETAILED DESCRIPTION 

DEFINITIONS 

The following terms shall have the meaning set forth below: 

• sequential step sequencing - sequencing a polynucleotide by individual
reactions, each reaction detecting no more than the sequence of one type of
nucleotide at a time. The nucleotide detected can occur, for example, once or more
than once in adjacent positions, such as G, GG or GGGGG (SEQ ID NO: 1).

• actual non-redundant contiguous string - a base sequence in which no base
is repeated in the immediately adjacent position.
For example, TACATGTACTGCT (SEQ ID NO: 2) is an actual non-redundant
contiguous string, whereas TAACATGTACTGCTT (SEQ ID NO: 3) is not,
although the internal sequence within this sequence, ACATGTACTGCT (SEQ
ID NO: 4), is an actual non-redundant contiguous string.

• superimposed non-redundant string pattern - pattern derived from an
actual sequence. In the pattern, the redundancies in the actual sequence are
removed, a redundancy being the duplication of a base in the immediately
adjacent base. For example, given an actual sequence
CATTAAAGGGAAAAGCCCAGTCA (SEQ ID NO: 5), the superimposed non-redundant
string pattern of the sequence is CATAGAGCAGTCA (SEQ ID NO: 6).
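
By way of illustration only, the two string definitions above can be expressed as a short Python sketch. The function names are hypothetical and are not part of the disclosure; the sequences are those of SEQ ID NOS: 2, 3 and 5.

```python
from itertools import groupby


def is_non_redundant(seq: str) -> bool:
    """Return True if no base is immediately followed by the same base."""
    return all(a != b for a, b in zip(seq, seq[1:]))


def superimposed_pattern(seq: str) -> str:
    """Collapse every run of identical adjacent bases to a single base."""
    return "".join(base for base, _run in groupby(seq))


print(is_non_redundant("TACATGTACTGCT"))      # SEQ ID NO: 2
print(is_non_redundant("TAACATGTACTGCTT"))    # SEQ ID NO: 3
print(superimposed_pattern("CATTAAAGGGAAAAGCCCAGTCA"))  # SEQ ID NO: 5
```

Under these definitions, the pattern derived from any sequence is itself an actual non-redundant contiguous string.
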

In one aspect, the present invention relates to methods of sequencing of
polynucleotides using sequential step sequencing. The sequential step sequencing
methods of the invention, unlike the methods of the prior art, involve individual
reactions for each type of base in every position in which it appears on the DNA
molecule, or for every position in which it appears next to a different type of base.
The methods of the invention are preferably used in the context of a microfluidics-based
device for automatedly moving fluids in and out of a reaction chamber,
which has been disclosed in U.S. Patent Serial Number 60/010513, filed January
24, 1996, the contents of which are incorporated herein by reference. The present
invention provides an advantage over prior art methods, for example, in
eliminating the need for electrophoresis, which is one of the most time-consuming
steps of sequencing reactions. Additionally, the present invention provides for an
increased rate of sequence read-out since nucleotide addition can occur, for
example, at 800 nucleotides per minute.

The methods of the invention begin with the provision of a single-stranded
polynucleotide template that is annealed with a primer, forming a template-primer
complex. In one aspect, the present invention provides for adding one
nucleoside triphosphate or analog thereof at a time to the template-primer
complex, the nucleoside triphosphate or analog being labelled. If necessary, each
of the four nucleoside triphosphates or analogs is added until a label is detected
due to the incorporation of a nucleotide into the complex. This method can be
used with a nucleoside triphosphate analog that is modified to preclude any
subsequent addition of such analog after a first analog has been added to the
primer, such as a dideoxynucleoside triphosphate. Alternatively, this method can
be used with a nucleoside triphosphate or analog thereof that does not stop
addition after a first nucleotide or nucleotide analog has been added to the
primer.

In other aspects of the invention, all four nucleoside triphosphates or analogs 
are added at once to the complex, each nucleoside triphosphate or analog having a 
different label. Using the latter method, the sequence at this position is 
determined by identifying the type of label incorporated into the complex. This 
method is preferably used only with a nucleoside triphosphate analog that is 
modified to preclude any subsequent addition of such analog after a first analog 
has been added to the primer. 

In one aspect of the invention, when a nucleoside triphosphate or analog is 
used in the sequential step sequencing methods of the invention, in which the 
nucleoside triphosphate or analog is not modified to preclude any subsequent 
addition of such nucleoside triphosphate or analog after a first nucleotide has 
been added to the primer, the sequence obtained may not match the actual 
sequence of the polynucleotide. Instead, the sequence obtained may be a 
superimposed non-redundant string pattern. Specifically, if the polynucleotide 
sequence has a redundancy such that immediately adjacent bases are the same, 
the sequence obtained using the latter methods of the invention will only detect 
one of the bases. For example, a polynucleotide with a sequence of 
CATTAAAGGGAAAAGCCCAGTCA (SEQ ID NO: 5) will be detected as the
corresponding superimposed non-redundant string pattern, CATAGAGCAGTCA 
(SEQ ID NO: 6). Thus, in one aspect, the methods of the invention provide for the 
detection of a superimposed non-redundant string pattern in a polynucleotide 
template. 
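
The behavior described in the preceding paragraph can be sketched, purely as an illustration, by a small simulation. The sketch assumes that the strand being synthesized is known to the simulation (it stands in for the complement of the template downstream of the primer) and that the label is scored only as present or absent; the function name and the order in which bases are offered are hypothetical.

```python
def sequential_step_read(target: str, bases: str = "ACGT") -> str:
    """Simulate offering one labeled, non-terminating nucleotide type at a time.

    `target` is the strand the polymerase would synthesize. Each detected
    label is recorded as a single base call, however many identical bases
    were actually incorporated in that reaction.
    """
    pos = 0
    calls = []
    while pos < len(target):
        for base in bases:
            if target[pos] == base:
                # The polymerase extends through the whole run of this base.
                while pos < len(target) and target[pos] == base:
                    pos += 1
                calls.append(base)  # label detected, presence only
                break
            # No label detected: wash and offer the next base.
        else:
            raise ValueError(f"no offered base matches {target[pos]!r}")
    return "".join(calls)


print(sequential_step_read("CATTAAAGGGAAAAGCCCAGTCA"))
```

Because each run of identical bases produces one call, the read-out is the superimposed non-redundant string pattern rather than the actual sequence.
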

In another aspect of the invention, when a nucleoside triphosphate or analog 
is used in the sequential step sequencing methods of the invention, in which the 
nucleoside triphosphate or analog is not modified to preclude any subsequent 
addition of such nucleoside triphosphate or analog after a first nucleotide has 
been added to the primer, the sequence obtained will match the actual sequence 
of the polynucleotide when the amount of label attached to the nucleotide is 
quantified, using, for example, autoradiography followed by scanning of the
autoradiogram to determine signal strength.
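
As a hedged illustration of why quantifying the label recovers the actual sequence, the sketch below assumes an idealized signal that scales linearly with the number of identical bases incorporated in one reaction. Real signal measurements may not be this clean; the function names and the `signal_per_base` parameter are hypothetical.

```python
from itertools import groupby


def quantified_read(target: str, signal_per_base: float = 1.0):
    """Return one (base, signal) pair per reaction that gave a label.

    Assumes, for illustration, that signal is proportional to run length.
    """
    return [(base, signal_per_base * len(list(run)))
            for base, run in groupby(target)]


def reconstruct(calls, signal_per_base: float = 1.0) -> str:
    """Rebuild the actual sequence by converting each signal to a run length."""
    return "".join(base * round(signal / signal_per_base)
                   for base, signal in calls)


calls = quantified_read("CATTAAAGGGAAAAGCCCAGTCA")
print(calls[:4])
print(reconstruct(calls))
```
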

More generally, in one aspect, the present invention provides a method of
sequential step sequencing of a polynucleotide having x number of nucleotides
comprising:

(A) providing a single-stranded polynucleotide template and a first
complementary primer having n nucleotides, wherein n is an integer greater than
three;

(B) causing the template and the primer to anneal, thereby forming a
template-primer complex;

(C) adding a template-dependent nucleotide polymerase and at least one
nucleoside triphosphate or analog thereof having a label attached thereto,
wherein the nucleoside triphosphate or analog thereof includes a base selected
from the group consisting of adenine, thymine, cytosine, guanine, and uracil; and

(D) determining whether a label is associated with the template-primer
complex or which label is associated with the template-primer complex.

Preferably, the above method also includes removing unincorporated
nucleoside triphosphate or analog from the template-primer complex. 

In certain embodiments of the above method, step (C) is limited to using one
nucleoside triphosphate or analog thereof, and if no label is associated with the
template-primer complex as determined in step (D), then steps (A) to (D) are
repeated using another nucleoside triphosphate or analog thereof having a
different base than that used previously in step (C), steps (A) to (D) being
repeated until it is determined that a label is associated with the template-primer
complex.

Preferably, the above methods further comprise step (E), step (E) being, upon
having determined which base was added to the first primer by exercise of step
(D), a second primer is generated having n+y nucleotides, y being one or the
number of identical adjacent nucleotides, wherein the added nucleotide is at its 3'
end; and steps (A) to (D) are repeated with the proviso that the second primer is
substituted for the first primer. Step (E) is preferably repeated until the primer is
at least x nucleotides long, such that n=x, x being the number of nucleotides in
the polynucleotide being sequenced.
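
The cycle of steps (A) to (E) can be sketched as follows for the case y = 1, in which each reaction adds at most one nucleotide. This is an illustrative simulation only: the template is assumed to be written 3' to 5' so that position i of the template pairs with position i of the growing primer, and the function name, the base order and the reaction counter are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def extend_one_base_at_a_time(template_3to5: str, primer: str):
    """Repeat steps (A)-(E) until the primer spans the template.

    Returns the final primer and the number of single-base reactions run.
    """
    reactions = 0
    while len(primer) < len(template_3to5):
        wanted = COMPLEMENT[template_3to5[len(primer)]]
        for base in "ACGT":              # steps (A)-(D), one base per reaction
            reactions += 1
            if base == wanted:           # step (D): label detected
                primer += base           # step (E): second primer has n+1 bases
                break
    return primer, reactions


print(extend_one_base_at_a_time("GTAACG", "CAT"))
```

The reaction count illustrates why the method involves many individual reactions: up to four per position when bases are offered one at a time.
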

In certain embodiments, the methods of the invention involve the use of a
nucleoside triphosphate analog modified to preclude any subsequent addition of
such analog after a first analog has been added to the primer.

When such nucleoside triphosphate analog is used with the methods of the
invention, in addition to adding one labeled nucleoside triphosphate analog at a
time, the methods also include the addition of more than one labeled nucleoside
triphosphate analog at a time. Thus, in certain embodiments, the nucleoside
triphosphate analog of step (C) is a combination comprising two, three, or four
different nucleoside triphosphate analogs having different bases, wherein the
nucleoside triphosphate analogs having different bases are differentially labeled.
Preferably, the differentially labeled nucleoside triphosphate analogs are labeled
with fluorescent dyes, such as fluorescein, rhodamine, 7-amino-4-methylcoumarin,
dansyl chloride, Cy3, Hoechst 33258, R-phycoerythrin, Quantum
Red™, Texas Red, suitable analogs and derivatives thereof, and the like, which
can be obtained commercially, such as from Sigma.

When a combination of two, three or four different nucleoside triphosphate
analogs is used, it is preferred that all four different nucleoside triphosphate
analogs are used in concert, the four nucleoside
triphosphate analogs having the bases adenine, thymine, cytosine and guanine if
the polymerase is DNA-dependent, or the bases adenine, uracil, cytosine and
guanine if the polymerase is RNA-dependent.
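
When all four differentially labeled analogs are added together, each reaction yields one detected label, and the base call follows from the label's identity. The sketch below illustrates this with a hypothetical dye-to-base assignment; the source names suitable dyes but does not assign them to particular bases.

```python
# Hypothetical assignment of dyes to bases, for illustration only.
DYE_TO_BASE = {
    "fluorescein": "A",
    "rhodamine": "C",
    "Cy3": "G",
    "Texas Red": "T",
}


def call_bases(detected_dyes) -> str:
    """Translate the dye detected in each successive reaction into a base."""
    calls = []
    for dye in detected_dyes:
        if dye not in DYE_TO_BASE:
            raise ValueError(f"unrecognized label: {dye!r}")
        calls.append(DYE_TO_BASE[dye])
    return "".join(calls)


print(call_bases(["rhodamine", "fluorescein", "Texas Red"]))
```
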

The above-described sequential step sequencing methods of the invention
preferably include step (E), step (E) being, upon having determined which base
was added to the first primer by exercise of steps (A) to (D), a second primer is
generated having n+1 nucleotides, wherein the added nucleotide or nucleotides
are at its 3' end; and steps (A) to (D) are repeated with the proviso that the second
primer is substituted for the first primer. In certain preferred embodiments, step
(E) is repeated until n is x nucleotides long, x being the number of nucleotides in
the polynucleotide being sequenced, thereby providing for a full sequence.

The methods of the present invention can be used, for example, to sequence
the 3' end of an mRNA or a cDNA nucleotide sequence. Using sequential step 
DNA sequencing and a poly-T or a poly-U primer, nucleoside triphosphates or 
analogs thereof having the bases thymine or uracil are added until the primer is 
extended to the beginning of the poly-A tail. The sequence from this point is then 
determined using the methods of sequential step sequencing described above, and 
can be used to determine the actual sequence, or a superimposed non-redundant 
string pattern. This aspect of the invention overcomes the problems associated 
with the prior art sequencing methods, such as the presence of a smear on the 
sequencing gel since a poly-T or poly-U primer randomly anneals to different 
parts of the poly-A tail. 

Specifically, the above-described methods of sequential step sequencing can be 
used to sequence polynucleotides adjacent to a poly-A tail of the template, the 
methods further comprising the following steps prior to providing the first primer 
in step (A): 

(a) providing a single-stranded polynucleotide template and an initial 
complementary primer, the template having a poly-A sequence, and the primer 
being a poly-T or a poly-U primer; 

(b) causing the template and the primer to anneal, thereby forming a 
template-primer complex; 

(c) adding a template-dependent nucleotide polymerase and a nucleoside
triphosphate or analog thereof including a base, the base being thymine or uracil, 
thereby forming an elongated initial primer. 
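
Steps (a) to (c) can be pictured with the following sketch, offered as an illustration under simplifying assumptions. Positions in the poly-A tail are numbered from 0 at the end of the tail adjacent to the unique sequence; the poly-T primer anneals with its 3' end at position `start`, and extension with T alone proceeds until the tail is exhausted. The function and parameter names are hypothetical.

```python
def elongate_initial_primer(tail_len: int, primer_len: int, start: int):
    """Extend a poly-T primer with T only, toward the start of the poly-A tail.

    Returns (position of the 3' end after extension, elongated primer length).
    """
    if not 0 <= start <= tail_len - primer_len:
        raise ValueError("primer must anneal entirely within the poly-A tail")
    three_prime = start
    added = 0
    while three_prime > 0:   # next template base is still A, so T is added
        three_prime -= 1
        added += 1
    return three_prime, primer_len + added


# Wherever the primer anneals, its 3' end finishes at the same position.
for start in (0, 3, 10):
    print(start, elongate_initial_primer(30, 20, start))
```

Under this model, every elongated initial primer ends at the same template position regardless of where it first annealed, which is consistent with the stated purpose of avoiding the smear caused by random annealing within the tail.
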

In certain embodiments, the sequenced polynucleotides near the poly-A tail of 
the template are contiguous, and optionally, the sequenced nucleotides form a 
non-redundant contiguous string. In other embodiments, the sequenced 
polynucleotides near the poly-A tail of the template are non-contiguous and form 
a superimposed non-redundant string pattern. 

In certain embodiments, the first primer used in step (A) is the elongated
initial primer, the primer being complementary to the 5' end of the poly-A
sequence in the template polynucleotide. In other embodiments, the elongated
initial primer is used to determine at least one nucleotide of the template adjacent
to the poly-A sequence, and the first primer used in step (A) is complementary to
this nucleotide and at least a portion of the poly-A sequence.

Preferably, the primers used in the methods of the invention are about 10 to
about 50 nucleotides long, and more preferably, about 15 to about 30 nucleotides
long.

In preferred embodiments of the sequencing methods of the invention, the
label attached to the nucleoside triphosphate or analog is preferably selected from
the group consisting of a radioisotope, a fluorescent dye, a signal-generating
enzyme, and a first ligand that specifically binds to a second ligand comprising a
radioisotope, a fluorescent dye or a signal-generating enzyme, and most
preferably, the label is a fluorescent dye. Suitable radioisotopes include, but are
not limited to, 3H, 14C, and 32P. Suitable fluorescent dyes include, but are not
limited to, fluorescein, rhodamine, 7-amino-4-methylcoumarin, dansyl chloride,
Cy3, Hoechst 33258, R-phycoerythrin, Quantum Red™, Texas Red, suitable
analogs and derivatives thereof, and the like. Suitable signal-generating enzymes
include, but are not limited to, alkaline phosphatase, peroxidase, and urease. Any
of the aforementioned labels can be obtained commercially, such as from Sigma.

For instance, labeling methods are described in: Sinha and Striepeke,
"Oligonucleotides with Reporter Groups Attached to the 5' Terminus" in
Oligonucleotides and Analogues: A Practical Approach, Eckstein, Ed., IRL,
Oxford, 1991, p. 185 et seq.; Sinha and Cook, "The Preparation and Application of
Functionalized Synthetic Oligonucleotides: 3. Use of H-Phosphate Derivatives of
Protected Amino-Hexanol and Mercapto-Propanol or Mercapto-Hexanol," Nucleic
Acids Research, 1988, Vol. 16, p. 2659 et seq.; Haugland, Molecular Probes
Handbook of Fluorescent Probes and Research Chemicals, Molecular Probes, Inc.,
Eugene, OR, 1992, p. 20 et seq.; Theisen et al., "Fluorescent Dye Phosphoramidite
Labelling of Oligonucleotides," Tetrahedron Letters, 1992, Vol. 33, p. 3036 et seq.;
Rosenthal and Jones, "Genomic Walking and Sequencing by Oligocassette
Mediated Polymerase Chain Reaction," Nucleic Acids Research, 1990, Vol. 18, p.
3095 et seq.; Smith et al., "The Synthesis of Oligonucleotides containing an
Aliphatic Amino Group at the 5' Terminus - Synthesis of Fluorescent DNA
Primers for Use in DNA-Sequence Analysis," Nucleic Acids Research, 1985, Vol.
13, p. 2399 et seq.

The detection used in conjunction with the invention will depend on the
nature of the label. Where a colorimetric or fluorescent label is used, detection
can be by visual inspection or with an optical instrument such as the fluorescence
microscope from Olympus (Lake Success, NY), the Plate Reader device from BioTek
Instruments (Winooski, VT), or the CCD (charge-coupled device) camera from
Princeton Instruments (Princeton, NJ). Where radioisotopes are used, detection can
comprise such spatially sensitive detection devices as the Phosphor Imager device
(Molecular Dynamics, Sunnyvale, CA), or can comprise separately detecting
individual solid surfaces in a detection apparatus such as a gamma-counter or a
liquid scintillation counter.

Further, the template-primer complex is preferably attached to a solid
surface, such as a microparticle, which is preferably paramagnetic. A
microparticle can have any shape, and preferably it is spherical. Preferably, it
has a diameter of less than 1 mm, and more preferably, less than 500 microns. In
certain preferred embodiments, the microparticles have a diameter from about
0.5 micron to about 25 microns, and more preferably about 1 micron to about 5
microns, and even more preferably, about 2 microns to about 4 microns.
Microparticles are comprised of any suitable material, the choice of material being
guided by its characteristics, which preferably include minimal non-specific
absorptive characteristics, such as that of polystyrene. In other embodiments, the
microparticles are comprised of, for example, plastic, glass, cellulose, a cellulose
derivative, nylon, polytetrafluoroethylene ("TEFLON"), ceramic and the like. A
paramagnetic bead can be comprised of, for example, iron
dispersed in a polystyrene matrix, and can be obtained with an associated
biomolecule, for example, from Dynal (Oslo, Norway), or without an associated
biomolecule, for example, from Bang Laboratories (Carmel, Indiana).

Additionally, in preferred embodiments, the template-dependent nucleotide
polymerase is a DNA polymerase or an RNA polymerase or a fragment thereof
having polymerase activity. Most preferably, the DNA polymerase or a fragment
thereof having polymerase activity is T7 DNA polymerase, the Klenow fragment
of E. coli DNA polymerase I or Taq polymerase, and the RNA polymerase or a
fragment thereof having polymerase activity is derived from E. coli or S. cerevisiae.

Furthermore, the modified nucleoside triphosphate is preferably a 
dideoxynucleoside triphosphate. Reaction conditions for the methods of the 
invention can be found, for example, in EP 0 223 618 and Maniatis et al., 
Molecular Cloning (Cold Spring Harbor 1982) which are hereby incorporated by 
reference herein in their entirety. Additionally, where methodologies are referred
to herein without specific enumeration of well-known method steps, generally,
the following texts can be referenced for further details: Ausubel et al., Short
Protocols in Molecular Biology; Sambrook et al., DNA Cloning, A Laboratory
Manual; and Molecular Biology Protocols, web-site:
listeria.nwfsc.noaa.gov/protocols.html.

In preferred embodiments, the methods of the invention are used in the 
context of a microfluidics-based device for automatedly moving fluids in and out of 
a reaction chamber, which has been disclosed in U.S. Patent Serial Number 
60/010513, filed January 24, 1996, the contents of which are incorporated herein 
by reference. The microfluidics device is designed specifically for moving small
volumes of fluids through fluid exchange channels that connect various sorts of 
fluid chambers. In particular, such a device comprises a fluid chamber, which is a 
generic term that describes chambers designed for storage of fluid reagents or 
reactants, i.e., a supply chamber, for locating reactants undergoing a reaction, i.e., 
a reaction chamber, for measuring a volume of a fluid, i.e., a metering chamber, 
and more. More particularly, the device includes a reaction chamber. The 
reaction chamber is comprised of any suitable material, as are all fluid chambers, 
such as, for example, glass, plastic, ceramic, or combinations thereof, and is 
connected to at least two fluid exchange channels for passaging material in and 
out of the reaction chamber. The reaction chamber preferably remains at a 
constant temperature of within about two degrees centigrade, wherein the 
temperature is between about 20°C and 65°C, and alternatively can have 
adjustable temperatures as in accordance with the requisites of the reactions to 
take place therein. 

The liquid distribution system can conduct synthesis in a great number of
separate reaction wells, such as 10,000 reaction wells. The synthesis in each
reaction well can occur on a bead or microparticle or can occur on the surfaces of
the wells, where these surfaces have been appropriately treated. The wells are
formed on a plate that is separable from the portions of the liquid distribution
system used to shuttle reagents to a multitude of reaction wells. Another way of
forming an array is to apply the photolithographic synthesis procedures described
in a number of patents and patent applications owned by Affymax, Inc. These
include Fodor et al., "Very Large Scale Immobilized Polymer Synthesis,"
WO92/10092; Dower et al., "Method of Synthesizing Diverse Collections of
Oligomers," WO93/06121; Campbell et al., "Methods for Synthesis of
Phosphonate Esters," U.S. Pat. 5,359,115; Campbell, "Methods for Synthesis of
Phosphonate Esters," U.S. Pat. 5,420,328; Fodor et al., "Very Large Scale
Immobilized Polymer Synthesis," U.S. Pat. 5,424,186; and Pirrung et al., "Large
Scale Photolithographic Solid Phase Synthesis of Polypeptides and Receptor
Binding Screening Thereof," U.S. Pat. 5,143,854.

In another aspect, the methods of the invention involve the identification of a 
polynucleotide or polynucleotides having a contiguous non-redundant string or a 
superimposed non-redundant string pattern. The detection of the presence of a 
non-redundant contiguous string can be used, for example, to identify a particular 
gene. Alternatively, for example, if the non-redundant contiguous string is not 
unique to a particular gene, the string can be used to form a DNA library that can
then be searched, for example, with a second string. Similarly, a superimposed 
non-redundant string pattern can be used, for example, to identify a gene or to 
search a DNA library. Preferably the string is at least about 10 nucleotides long, 
and more preferably, the string is at least about 12 nucleotides long. 

Specifically, one method of identifying a polynucleotide or a group of
polynucleotides comprises:

(A) providing a primer complementary to a contiguous string of non- 
redundant nucleotides, said primer having a label attached thereto; 

(B) providing a single-stranded polynucleotide template; 

(C) causing the template and the primer to anneal, thereby forming a
template-primer complex; 

(D) determining whether a label is associated with the template-primer 
complex or which label is associated with the template-primer complex. 

Another method of identifying a polynucleotide or a group of polynucleotides 
comprises: 

(A) providing a base sequence of a string, the string being a superimposed 
non-redundant string pattern or a contiguous non-redundant string; and 

(B) searching a computer database of polynucleotide base sequences using
the base sequence of the string. In embodiments wherein the string is a 
superimposed non-redundant string pattern, the above method preferably further 
comprises providing a computer program for searching for the superimposed 
string pattern in the polynucleotide sequences, the computer program being 
capable of identifying a superimposed string pattern despite the presence of a 
redundancy or redundancies within a sequence that includes the string pattern 
located in the base sequence of a polynucleotide or polynucleotides in the 
database. 
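
One way such a computer search could tolerate redundancies is sketched below, as an illustration only and not as the program contemplated above. Each base of the string pattern is allowed to match one or more consecutive copies of itself, using Python's standard `re` module. The database contents and record names are hypothetical.

```python
import re


def pattern_regex(pattern: str) -> "re.Pattern[str]":
    """Build a regex in which every base may be repeated one or more times."""
    return re.compile("".join(f"{re.escape(base)}+" for base in pattern))


def search_database(database: dict, pattern: str) -> dict:
    """Return {record name: (start, end)} for the first hit in each record."""
    rx = pattern_regex(pattern)
    hits = {}
    for name, seq in database.items():
        match = rx.search(seq)
        if match:
            hits[name] = match.span()
    return hits


# Hypothetical two-record database; gene1 is SEQ ID NO: 5.
db = {"gene1": "CATTAAAGGGAAAAGCCCAGTCA", "gene2": "TTTTGGGG"}
print(search_database(db, "CATAGAGCAGTCA"))   # SEQ ID NO: 6
```

An equivalent design would collapse every database sequence to its own superimposed pattern and perform an ordinary substring search; the regular-expression form has the advantage of reporting coordinates in the original, uncollapsed sequence.
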

While this invention has been described with an emphasis upon a preferred 
embodiment, it will be obvious to those of ordinary skill in the art that variations 
in the preferred composition and method may be used and that it is intended that 
the invention may be practiced otherwise than as specifically described herein. 
Accordingly, this invention includes all modifications encompassed within the 
spirit and scope of the invention as defined by the following claims. 

SEQUENCE LISTING
(1) GENERAL INFORMATION: 

(i) APPLICANT: Kumar, Rajan and Heaney, Paul 

(ii) TITLE OF INVENTION: METHOD FOR POLYNUCLEOTIDE 
SEQUENCING 

(iii) NUMBER OF SEQUENCES: 6 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: SARNOFF Corporation 

(B) STREET: CN 5300 

(C) CITY: Princeton 

(D) STATE: NJ 

(E) COUNTRY: USA 

(F) ZIP: 08543-5300 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible Pentium Pro

(C) OPERATING SYSTEM: WINDOWS NT 

(D) SOFTWARE: Microsoft WORD 97 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION: 
(A) NAME: Silveriu, John V. 



(B) REGISTRATION NUMBER: 34,014 

(C) REFERENCE/DOCKET NUMBER: SAR 12024PCT 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 609-734-2454 

(B) TELEFAX: 609-734-2673 

(2) INFORMATION FOR SEQ ID NO:1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1: 
GGGGG 

(3) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 
TACATGTACTGCT 
(4) INFORMATION FOR SEQ ID NO:3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 
TAACATGTACTGCTT 15 

(5) INFORMATION FOR SEQ ID NO:4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 
ACATGTACTGCT 12 

(6) INFORMATION FOR SEQ ID NO:5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 
CATTAAAGGGAAAAGCCCAGTCA 

(7) INFORMATION FOR SEQ ID NO:6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 
CATAGAGCAGTCA 
WHAT IS CLAIMED: 



1. A method of sequential step sequencing of a polynucleotide having x 
number of nucleotides comprising: 

(A) providing a single-stranded polynucleotide template and a first 
complementary primer having n nucleotides, wherein n is an integer greater than 
three; 

(B) causing the template and the primer to anneal, thereby forming a 
template-primer complex; 

(C) adding a template-dependent nucleotide polymerase and at least one 
nucleoside triphosphate or analog thereof having a label attached thereto, 
wherein the nucleoside triphosphate or analog includes a base selected from the 
group consisting of adenine, thymine, cytosine, guanine, and uracil; and 

(D) determining whether a label is associated with the template-primer 
complex or which label is associated with the template-primer complex. 

2. The method of claim 1, further comprising removing unincorporated 
nucleoside triphosphate from the template-primer complex. 

3. The method of claim 1, wherein step (C) is limited to using one nucleoside 
triphosphate or analog thereof, further comprising: 

if no label is associated with the template-primer complex as determined in 
step (D), then steps (A) to (D) are repeated using another nucleoside triphosphate 
or analog thereof having a different base than that used previously in step (C), 
steps (A) to (D) being repeated until it is determined that a label is associated 
with the template-primer complex. 

4. The method of claim 1, further comprising: 

(E) upon having determined which base was added to the first primer by 
exercise of step (D), a second primer is generated having n+y nucleotides, y being 
one if each nucleotide is added one at a time, or, if each nucleotide is added more 
than one at a time, y being the number of identical adjacent nucleotides, wherein 
the added nucleotide or nucleotides are at the primer's 3' end; and steps (A) to (D) 
are repeated with the proviso that the second primer is substituted for the first 
primer. 
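The cycle recited in claims 3 and 4 can be sketched as a small simulation: one labeled nucleoside triphosphate is offered at a time, a different base is tried when no label is detected, and the primer grows by y nucleotides once a label is detected. This is only a minimal model under stated assumptions. The template is given as the bases still to be copied, in copying order; detection is assumed to report which base was incorporated but not how many copies; the function names and the fixed A, C, G, T trial order are hypothetical. The complement of SEQ ID NO:5 is used as the template purely for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def incorporate(template_ahead, base, terminating):
    """Return y, the number of copies of `base` added to the primer.

    With a terminating analog at most one nucleotide is added; otherwise
    the polymerase copies the whole run of identical adjacent nucleotides.
    """
    count = 0
    for template_base in template_ahead:
        if COMPLEMENT[template_base] != base:
            break
        count += 1
        if terminating:
            break
    return count


def sequential_step_read(template_ahead, terminating=True, order="ACGT"):
    """Simulate the cycle and return the bases reported by label detection."""
    read = []
    position = 0
    while position < len(template_ahead):
        for base in order:  # try one labeled base at a time
            y = incorporate(template_ahead[position:], base, terminating)
            if y:  # label found on the template-primer complex
                read.append(base)
                position += y  # next cycle uses a primer of n + y nucleotides
                break
        else:
            raise ValueError("no base could be incorporated")
    return "".join(read)


# Hypothetical template whose full-length copy is SEQ ID NO:5.
TEMPLATE = "".join(COMPLEMENT[b] for b in "CATTAAAGGGAAAAGCCCAGTCA")
```

In this model, the terminating mode recovers the full sequence of SEQ ID NO:5, while the non-terminating mode yields the shorter string CATAGAGCAGTCA, which coincides with SEQ ID NO:6.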
5. The method of claim 1, wherein, in step (C), the nucleoside triphosphate 
analog has a modification that precludes any subsequent addition of such analog 
after a first analog has been added to the primer. 

6. The method of claim 1, wherein the method is used to sequence 
polynucleotides adjacent to a poly-A tail of the template, further comprising the 
following steps prior to providing the first primer in step (A): 

(a) providing a single-stranded polynucleotide template and an initial 
complementary primer, the template having a poly-A sequence, and the primer 
being a poly-T or a poly-U primer; 

(b) causing the template and the primer to anneal, thereby forming a 
template-primer complex; 

(c) adding a template-dependent nucleotide polymerase and a 
nucleoside triphosphate or analog thereof including a base, the base being 
thymine or uracil, thereby forming an elongated initial primer. 

7. The method of claim 1, wherein the label is selected from the group 
consisting of a radioisotope, a fluorescent dye, a signal-generating enzyme, and a 
first ligand that specifically binds to a second ligand comprising a radioisotope, a 
fluorescent dye or a signal-generating enzyme. 

8. The method of claim 1, wherein the template-primer complex is attached to 
a solid surface. 

9. The method of claim 8, wherein the solid surface is a microparticle. 

10. The method of claim 1, wherein the template-dependent nucleotide 
polymerase is a DNA polymerase or an RNA polymerase or a fragment thereof 
having polymerase activity. 

11. A method of identifying a polynucleotide or polynucleotides, comprising: 

(A) providing a primer complementary to a contiguous string of non- 
redundant nucleotides, said primer having a label attached thereto; 

(B) providing a single-stranded polynucleotide template; 

(C) causing the template and the primer to anneal, thereby forming a 
template-primer complex; 

(D) determining whether a label is associated with the template-primer 
complex or which label is associated with the template-primer complex. 
12. A method of identifying a polynucleotide or polynucleotides, comprising: 

(A) providing a base sequence of a string, the string being a 
superimposed non-redundant string pattern or a contiguous non-redundant 
string; and 

(B) searching a computer database of polynucleotide base sequences 
using the base sequence of the string. 

13. The method of claim 12, wherein the string is a superimposed non-redundant 
string pattern, further comprising providing a computer program for searching 
for the superimposed string pattern in the polynucleotide sequences, the computer 
program being capable of identifying a superimposed string pattern despite the 
presence of a redundancy or redundancies within a sequence that includes the 
string pattern located in the polynucleotide or polynucleotides in the database. 